Souvik Kundu, Ph.D.


I am a Staff Research Scientist at Intel Labs, leading research efforts in scalable and novel AI primitives. Prior to joining Intel, I completed my Ph.D. in Electrical & Computer Engineering at the University of Southern California, co-advised by Dr. Massoud Pedram and Dr. Peter A. Beerel. I was fortunate to receive the Outstanding Ph.D. Award and the Order of Arête Award, along with multiple prestigious fellowships. I am an honored recipient of the CPAL AI Rising Star Award 2025 and the Young Investigator Award 2025, conferred by CPAL-Stanford DS and the International Neural Network Society (INNS), respectively. I am also one of the youngest recipients of the Semiconductor Research Corporation (SRC, USA) Outstanding Liaison Award, given in 2023 for impactful research. My research goal is to empower human life with robust and efficient AI services via cross-layer innovation that blends algorithmic optimizations with existing and novel hardware compute and architectures. I have co-authored more than 75 peer-reviewed papers at top-tier conferences including NeurIPS, ICLR, ICML, ACL, CVPR, DATE, and DAC, with multiple Oral presentations, young fellow and travel awards, and best paper nominations. [google scholar]

I serve as the founding AC and committee member of the Conference on Parsimony and Learning (CPAL). Additionally, I serve on the AC and reviewer committees of various journals and conferences, including ICLR, NeurIPS (Outstanding Reviewer recognition '22), EMNLP (Outstanding Reviewer recognition '20), CVPR, DATE, and DAC.

news

May 20, 2025 :trophy: Honored and humbled to be a recipient of the International Neural Network Society (INNS) Young Investigator Award 2025!! :trophy:
May 18, 2025 :page_with_curl: :sparkles: 1x paper (On-the-Fly Adaptive Distillation of Transformer to Dual-State Linear Attention for Long-Context LLM Serving) accepted at ICML 2025, 1x paper (LAMB: A Training-Free Method to Enhance the Long-Context Understanding of SSMs via Attention-Guided Token Filtering) accepted at ACL 2025, and 1x paper (Accelerating LLM Inference with Flexible N:M Sparsity via A Fully Digital Compute-in-Memory Accelerator) accepted at ISLPED 2025! :page_with_curl: :sparkles:
Apr 09, 2025 I will be serving as an Area Chair (AC) at NeurIPS 2025!
Mar 26, 2025 :trophy: Gave an invited talk at the Stanford Data Science Department as part of my recognition as one of the CPAL AI Rising Stars 2025! :trophy:
Mar 25, 2025 Code for LANTERN (here: lantern-code), our ICLR 2025 paper, is now open-sourced! We are also glad to announce that LANTERN++, an extension of this work, received an Oral presentation at the SCOPE workshop at ICLR 2025. :sparkles:
Mar 21, 2025 :page_with_curl: :sparkles: 1x paper on microscaling quantization (MicroScopiQ) accepted at ISCA 2025! :page_with_curl: :sparkles:
Mar 06, 2025 I will be serving as an Area Chair at ACL 2025 and as a Track Chair for the AI/ML track at IEEE COINS 2025.

selected publications

  1. ISCA 2025
    MicroScopiQ: Accelerating Foundational Models through Outlier-Aware Microscaling Quantization
    Akshat Ramachandran, Souvik Kundu, and Tushar Krishna
    In International Symposium on Computer Architecture (ISCA), 2025
  2. ICLR 2025
    MambaExtend: A Training-Free Approach to Improve Long Context Extension of Mamba
    Souvik Kundu*, Seyedarmin Azizi*, Mohammad Erfan Sadeghi, and 1 more author
    In International Conference on Learning Representations (ICLR), 2025
  3. NeurIPS 2024
GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM
    Hao Kang, Qingru Zhang, Souvik Kundu, and 4 more authors
    In Thirty-Eighth Annual Conference on Neural Information Processing Systems Workshop (Spotlight), 2024
  4. ICLR 2023
Learning to Linearize Deep Neural Networks for Secure and Efficient Private Inference
    Souvik Kundu, Shunlin Lu, Yuke Zhang, and 2 more authors
In International Conference on Learning Representations (ICLR), 2023
  5. NeurIPS 2021
    Analyzing the confidentiality of undistillable teachers in knowledge distillation
    Souvik Kundu, Qirui Sun, Yao Fu, and 2 more authors
    In Advances in Neural Information Processing Systems, 2021
  6. WACV 2021
    Spike-thrift: Towards energy-efficient deep spiking neural networks by limiting spiking activity via attention-guided compression
    Souvik Kundu, Gourav Datta, Massoud Pedram, and 1 more author
In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2021
  7. ASP-DAC 2021
DNR: A Tunable Robust Pruning Framework through Dynamic Network Rewiring of DNNs
    Souvik Kundu, Mahdi Nazemi, Peter A Beerel, and 1 more author
    In Proceedings of the 26th Asia and South Pacific Design Automation Conference, 2021
  8. IEEE TC 2020
    Pre-defined sparsity for low-complexity convolutional neural networks
    Souvik Kundu, Mahdi Nazemi, Massoud Pedram, and 2 more authors
    In IEEE Transactions on Computers, 2020